Abstract: Digital data is everywhere, and its ubiquity is causing profound changes in our 
personal lives and in the functions of government, business, and academia. Organizations 
of all sizes and purposes are seeking to take advantage of the big data tsunami and the 
opportunities it presents. RTI International, a non-profit U.S. research organization, draws 
more than 80 percent of its $760 million in annual revenues from competitive grants and 
contracts funded by the U.S. government. The organization is rich in talent and expertise 
but not currently aligned in a way that meets big data’s challenges. To thrive in this rapidly 
changing environment, RTI must determine how to seize opportunities big data presents, 
survive the threats posed by big data, and offer its clients expanded services. How well RTI 
responds to these challenges will determine its role in the search for solutions to the major 
social and scientific problems of our day. 
Forward-thinking organizations and their leaders around the world are wrestling with the 
“big data revolution” and its impact on their businesses. Some firms are mining mountains 
of process data to fine-tune manufacturing operations. Others are sifting through Facebook 
postings and Twitter feeds to understand customer sentiments about their products and services. 
Software vendors are marketing sector-specific applications (“insight solutions”) to assist 
in this process. Meanwhile, high-tech companies are collaborating with consumer-ratings 
groups on mobile applications that use government-financed research to compare the safety 
and effectiveness of various medications and offer the best recommendation to a patient 
(Comstock, 2013). To help organizations store, aggregate, and retrieve these ever-expanding 
forms of information, cloud computing vendors are reshaping the data storage landscape. 

In 2010, more than four billion people, or 60 percent of the world’s population, were using 
mobile phones (Manyika et al., 2011). About 12 percent of them were using smartphones, a 
percentage that’s growing more than 20 percent per year. Big data is estimated to represent 
potentially $300 billion to the U.S. healthcare system through greater efficiency and delivered 
value. Global personal location data, which can quickly identify the most efficient route from 
Point A to Point B, represents as much as $100 billion in revenue for service providers. 

Since 2000, the amount of information collected by the federal government has increased 
at a mind-boggling rate (TechAmerica Foundation, 2012). In 2009, the federal government 
produced 848 petabytes of data, and U.S. healthcare data on its own reached 150 exabytes. 
Five exabytes of data would contain all of the words ever spoken by human beings on earth. 
As big data permeates every aspect of how we live and conduct business, most organizations 
fear standing on the sidelines while others figure out how to use big data to their advantage. 
In response to these and numerous other developments, many organizations are attempting 
to prepare themselves to take advantage of the big data tsunami and the opportunities it 
presents. Some are opening new divisions or setting up “predictive analytics” initiatives. In 
many cases, this involves hiring new talent — data scientists and other professionals — to help 
lead their efforts. While data-savvy companies may be on the hunt for the next Nate Silver, 
the statistical whiz who correctly predicted the outcome of the 2008 and 2012 presidential 
elections, their efforts may be hindered by dire predictions of a looming shortage of qualified 
data analysts and managers (Davenport & Patil, 2012; Rooney, 2012). 

RTI International, a survey research firm, is an organization that is rich in talent and 
expertise but not currently aligned in a way that meets big data’s challenges. To thrive in this 
new environment, we must address questions such as: 

• How do we re-calibrate technological savvy and subject matter expertise in order to 
  meet emerging business opportunities associated with big data? 

• How do we organize staff so that subject matter experts and more data-savvy junior 
  staff can easily share their expertise across disciplines? 

• How do we avoid investing resources in massive projects (the “build it and they will 
  come” mentality) before truly understanding clients’ needs and requirements? 

Our article aims to share a practitioner’s perspective on the challenges of restructuring 
a knowledge-worker company in the midst of the big data revolution. In some cases, these 
challenges include retooling fundamental human resource processes such as recruiting and 
hiring, performance management, and talent development. As organizations begin to orient 
themselves and their workforce to meet big data’s demands, new areas of opportunity are 
likely to emerge, but the path forward will not be clear and straightforward. After briefly 
describing RTI International, we discuss the main challenges to the organization posed by 
big data. Based on RTI’s experiences to date, we derive several implications for organizations 
in general. 


RTI INTERNATIONAL: BACKGROUND, MISSION, AND CAPABILITIES 


RTI International is an independent, nonprofit research institute that provides research, 
development, and technical services to government (local, state, and federal) and commercial 
clients worldwide. Founded in 1958 with the creation of North Carolina’s Research Triangle 
Park, RTI was conceptualized as a partnership between North Carolina’s business leaders, 
the state government, and the region’s three major universities (University of North 
Carolina-Chapel Hill, North Carolina State University, and Duke University). With a mission 
of improving the human condition by turning knowledge into practice, RTI leverages its 
research and technical capabilities to solve critical social and scientific problems. 

The company’s annual revenues are approximately $760 million, with over 80 percent of 
this sum derived from competitive grants and contracts funded by the U.S. government. More 
than 3,700 staff from 250 scientific and technical disciplines work in eight U.S.-based and ten 
international offices. Staff members carry out approximately 1,800 funded research projects 
annually, many of which result in peer-reviewed publications and/or adjudicated government 
statistical reports. 

RTI is organized into four business units: 

• Social, Statistical, and Environmental Sciences: Program areas include criminal 
  justice and behavioral health, environmental sciences, statistics and epidemiology, 
  and survey and computing sciences. 

• Discovery Science and Technology: Program areas include energy technology, 
  materials and electronic technology, organic and medicinal chemistry, pharmacology 
  and toxicology, biomarkers and systems biology, and analytical chemistry and 
  pharmaceutics. 

• International Development: Program areas include governance and economic 
  development, international education, and global health. 

• Health Solutions: A specialized group focused entirely on services for the 
  pharmaceutical and biotechnology sectors. 
The Social, Statistical, and Environmental Sciences (SSES) business unit, led by the 
first author, is RTI’s largest, with approximately $385 million in annual revenue, 1,500 
professional staff, and 1,200 temporary data collectors. Gabel’s role is to set the strategic 
direction for the group and ensure that it is appropriately organized to face the future. 


BIG DATA’S CHALLENGES TO RTI INTERNATIONAL 


Big data’s transformative nature presents unique challenges to RTI, whose organizational 
structure promotes deep subject matter expertise and research capacity. While this structure 
has yielded significant benefits, it can inhibit the type of cross-disciplinary innovation that big 
data demands. For RTI to become a stronger organization in the future, it must successfully 
address three large challenges posed by the big data phenomenon. 


Seize Opportunities 


RTI International was founded more than 50 years ago and has long been organized 
along traditional scientific/academic disciplines (e.g., economics, statistics, engineering, 
chemistry). Even when reorganizations have taken place, the cultural affinities of staff tend to 
run along disciplinary lines. Likewise, RTI’s human resource system (e.g., job descriptions, 
job families and functions, job expectations and promotion criteria) mirrors that of academic 
disciplines. For example, the most common job titles in SSES, and the number of people 
holding each job, are: Public Health (335), Economics/Health Economics (197), Systems 
Analysis and Programming (196), Statistics (156), Survey Methodology/Operations (149), 
Education (126), Environmental Science (95), and Epidemiology (47). Nearly 40 percent 
of our technical staff have terminal degrees, and 65 percent have at least a master’s degree. 
About 53 percent of RTI’s workforce is female, and 29 percent are 35 years of age or younger. 
Average employee tenure is just under eight years. 

Our well-tested and successful business model uses a matrix management approach, 
assembling cross-department teams for specific proposal and project efforts. Over time, this 
approach has produced a culture of specialization and its attendant demands and rewards. 
Despite the successes that this model has brought to RTI in the past, we recognize that it 
is unlikely to succeed in the era of big data, which demands the blending of disciplines, 
especially across the boundaries of subject matter, statistics, and computing. We are trying 
to come to grips with the best way to organize ourselves to address the challenges and 
opportunities that big data will present. 

Yet it is not obvious how to go about that. RTI’s leadership and business units are seeking 
ways to address the challenges of big data in the clearest and most realistic manner. Our 
statistics group sees it primarily as an analytics issue, with a clear emphasis on their statistical 
sampling expertise. Our technology group sees it as a high-performance computing problem. 
And our subject matter experts see enormous potential in the power of virtually unlimited 
sample sizes. RTI’s legal office and Institutional Review Boards see exponential increases 
in data privacy risks. Not surprisingly, each views the big data phenomenon from its unique 
functional perspective. Fortunately, we are in a position to leverage our talent to capitalize 
on the opportunities big data presents, but it will require us to break down silos, revise our 
job descriptions and hiring practices to attract staff with blended skills, and re-think the way 
we create our project teams. To begin this process, SSES initiated a re-organization this year 
that more tightly coordinates our data, computing, statistical, and analytics resources. In one 
set of moves, staff from our computing, biostatistics, and epidemiology departments were 
merged into a single new research center with a defined focus on managing health research 
data. In another move, a set of sampling statisticians, programmers, and data managers were 
combined to form a new data science and statistical methods center. While seemingly modest, 
both announced changes were perceived as controversial, as they moved researchers out of 
traditional discipline-focused departments. 

In organizations like RTI, such changes can elicit a strong, and even emotional, response 
from staff. In Gabel’s judgment, this is because they strike at the core of employees’ professional 
identity, especially if the corporate culture has long valued and celebrated a disciplinary 
focus. Employees attach great meaning to the name of the organizational unit in which 
they reside. Moving a statistician out of the “statistics department” and into the “health 
data management” or “energy analytics” department raises questions around career path and 
professional growth, job performance expectations, access to appropriate mentoring, and 
the professional risk of overly narrow specialization. These are all reasonable concerns for 
employees when they are wondering, “What do these changes mean for me and who I am?” 

To help overcome such concerns, our reorganization included a robust communication and 
change-management plan. Town hall and small-group meetings addressed concerns head-on, 
and staff members helped develop and review responses to frequently asked questions. At 
every step, we tried to ensure that we could answer the “Why are we doing this” question 
with market data and explanations of how this would better prepare RTI to capture current 
and future market opportunities. The goal in adopting this approach is twofold: to more 
closely align our subject matter and data science expertise and to sustain those collaborations 
within RTI’s specific disciplines. In effect, our reorganization has built dotted-line bridges 
that connect RTI’s subject matter and data science experts as they create new communities 
while allowing experts to maintain their professional identities. 

We anticipate that our new organizational matrix will evolve as the market matures 
and our clients gain a deeper understanding of their business needs. We have adjusted our 
organizational matrix to support the needs of the market, such as with the creation of RTI’s 
multidisciplinary Center for the Advancement of Health IT in 2010. We will continue to 
make adjustments to become more market and customer-oriented in the future. 

To enhance our new organizational approach, RTI recently launched a customer 
relationship management platform called Salesforce.com. This robust tool, which is used 
by many organizations to support their sales and business development functions, includes a 
social media module called Chatter. Similar in look and feel to Facebook, Chatter allows RTI 
staff to form and communicate through cross-disciplinary groups around cutting-edge topics 
such as Big Data/Big Science, Global Health Informatics, Implementation Science, and 
Education and Workforce Development for 2025. Still in its early months of implementation, 
Chatter is proving to be a popular and time-efficient mechanism for building virtual expert 
communities within RTI. We are already seeing the organic formation of Chatter communities, 
including discipline-focused groups comprised of staff formerly in the same administrative 
unit. Of course, neither our re-organization nor our collaborative technology roll-out replaces 
the need for our domain experts to keep themselves up-to-date in their respective areas of 
expertise. We are actively encouraging our subject matter experts to collaborate on research 
projects and peer-reviewed publications across business silos, and to retain their membership 
and participation in professional societies and associations aligned to their expertise. We are 
also encouraging them to think about, study, and publish on the impact of big data on their 
discipline. 


Survive Threats 


RTI’s second big data challenge can be likened to the daunting task of “rebuilding the 
airplane engine in mid-flight.” As the largest business unit at RTI, SSES has a particular 
responsibility for maintaining our core capabilities while also positioning us for the future. In 
other words, we must find a way to maintain an engine that risks sputtering while operating at 
full capacity. The airplane engine is RTI’s survey research business. Much of SSES’s revenue 
comes from conducting scientifically rigorous, statistically representative survey projects 
(face-to-face household surveys, establishment surveys, and telephone surveys) for RTI’s 
federal government clients. As information from big data’s many sources is now available on 
a 24/7 basis, the intrinsic value of the statistically representative survey must be redefined. 
The advent of big data is forcing a paradigm shift in the federal government’s statistical 
system, as reflected in this blog post 
(http://directorsblog.blogs.census.gov/2011/05/31/designed-data-and-organic-data/) 
by Robert Groves, the former Director of the U.S. Census Bureau: 


We're entering a world where data will be the cheapest commodity around, simply 
because the society has created systems that automatically track transactions of all 
sorts. For example, internet search engines build data sets with every entry, Twitter 
generates tweet data continuously, traffic cameras digitally count cars, scanners record 
purchases, RFIDs signal the presence of packages and equipment, and internet sites 
capture and store mouse clicks. Collectively, the society is assembling data on massive 
amounts of its behaviors. Indeed, if you think of these processes as an ecosystem, it 
is self-measuring in increasingly broad scope. Indeed, we might label these data as 
“organic,” a now-natural feature of this ecosystem. 


In his article on the three eras of survey research, Groves (2011) distinguishes between 
“organic” data and “designed” data, or data collected via well-designed questionnaires in 
order to answer hypothesis-driven policy questions. Like the Census Bureau, RTI staff are 
experts at carrying out complex efforts to collect, manage, and analyze what Groves refers 
to as designed data. We are not, however, adequately prepared to change our core business 
in response to the onslaught of organic data generated by big data. Groves continues to be 
a thought leader on this subject. Now provost of Georgetown University, he welcomed the 
creation of the Massive Data Institute at the university’s newly formed McCourt School of 
Public Policy (Kerr, 2013). Groves noted the institute’s potential to capitalize on the explosive 
growth of quantitative public information through sites such as www.data.gov to help frame 
policy issues and train new generations of government leaders. Such training can also help 
address critical staffing needs at government agencies, where “just having the talent that can 
navigate these files” is sorely needed (Anderson, 2013). 

The formation of the Massive Data Institute and the potential it holds for addressing 
the pressing public policy and scientific issues of the 21st century signals that disruptive 
innovation to the established federal survey research business is just around the corner 
(Christiansen, 2013). Disruptive innovations are not uncommon, of course, and many 
industries have faced comparable or greater challenges. Big data represents a threat to one 
of RTI’s core competency areas as the federal survey research community faces increased 
competition with a “big data is faster and cheaper” value proposition. Our effort thus far 
to leverage knowledge from organic data has taken a “skunkworks” approach, using internal 
resources to fund several multi-disciplinary R&D teams. These teams are charged with 
experimenting with new methods and publishing their results in the peer-reviewed literature 
to establish our market position and the credibility of our methods. Our teams are exploring 
the use of Twitter, Facebook, and other social media platforms as a source of scientifically 
valid data. 

Although this approach is more experimental and riskier than our traditional interdisciplinary 
model, it is nonetheless focused on establishing RTI’s credibility with new methods of data 
collection and analysis. We pay for these projects through R&D funds created specifically for 
this purpose, allowing teams to experiment without the financial pressures of business unit 
revenue projections. In RTI’s cost accounting system, having staff involved in these internal 
projects is “budget friendly” to line managers with profit/loss responsibility — it generates 
overhead to the business unit in the same way as externally funded research. Thus, a large 
component of the potential opportunity cost to the business unit is offset. However, we are 
fast approaching the transition point where our core competency of collecting data needs 
to be augmented, and eventually replaced, with a new competency of blending statistically 
representative (expensive) designed data with commodity (inexpensive) organic data while 
still drawing valid inferences suitable for public policymaking. Navigating this transition 
presents enormous challenges to our company. 

Those challenges exist on two levels. The first level is almost existential: how to address 
(not necessarily change) the fundamental view of data validity held by many RTI staff. This 
view insists that data must be sampled, collected, analyzed, and reported using traditional, 
proven statistical methods. Efforts to change this model are viewed with suspicion and 
deep-seated fears that the resulting data analyses will not produce consistent, reliable results. We 
are attempting to counter these fears by encouraging staff to publish their work in traditional 
journals, thereby establishing the credibility of the new methods. The second challenge is 
more mechanical but no less daunting. We need to learn how to merge data from disparate 
sources which were not originally intended to be joined together. This requires that we create 
different ways of thinking about how to synthesize data so that it can address the critical data 
integrity questions of causality and generalizability. 
Our close relationships with area universities are also helping to address these challenges. 
Through means like visiting scholars and sabbatical-type exchange programs, we are able 
to access researchers working on some of the foundational issues around high-performance 
computing and management of voluminous data sets. RTI is a founding member of the 
National Consortium for Data Science, headquartered at the University of North Carolina at 
Chapel Hill, which also provides us access to staff and expertise from IBM, SAS Institute, 
General Electric, Cisco, and other large organizations with numerous relevant capabilities. 
These types of partnerships and relationships are helping us accelerate our transformation 
and refine our expertise. 


Define What Clients Need — And Don’t Need — From Big Data 


RTI’s third big data challenge is helping our clients use new, as well as existing, sources 
of information to envision research questions that, until recently, had been impossible to 
quantify. This newly developed approach will integrate designed and organic data with 
subject matter and research expertise to produce focused insights that can better inform policy 
and decision support, predict how resources can be used in a more cost-effective manner, and 
put information in the hands of end-users in a more efficient, customized manner. 

Some long-standing federal government clients are beginning to make inroads in pursuit 
of these goals. For example, the National Institutes of Health announced this summer that it 
will fund up to $24 million per year for the next four years to establish investigator-initiated 
Big Data to Knowledge Centers of Excellence (National Institutes of Health, 2013). These 
centers are intended to help the research community use ever-larger and more complex 
datasets through development and distribution of innovative approaches, methods, and tools 
for data sharing and analysis. They will also provide training for students and researchers on 
data science methods. More broadly, RTI gained new insights into the big data capabilities 
and needs of our federal government clients at a recent event convened by the White House 
Office of Science and Technology Policy. Building on a $200 million Big Data Research and 
Development Initiative unveiled by the White House in 2012, the Obama Administration is 
now encouraging stakeholders including federal agencies, academia, nonprofit organizations, 
and state and local governments to participate in projects and initiatives that move big data 
from knowledge into action (Weiss & Zgorski, 2012). Key priorities include: 

• Advancing technologies that support advanced data management and data analytic 
  techniques 

• Educating and expanding the data science workforce 

• Developing, demonstrating, and evaluating applications of data that can improve key 
  outcomes in economic growth, job creation, education, health, energy, sustainability, 
  public safety, science, and manufacturing 

• Fostering regional innovation. 

RTI is at an early stage of developing its approach to big data, and we are deliberate 
about how to best invest our human and capital resources. One approach we are unlikely to 
pursue is creating a big data architecture or platform to support current and future projects. 
Instead, we tend to side with conclusions of the TechAmerica Foundation report that suggest 
that successful big data initiatives, especially in the public sector, are tailored to a specific, 
narrowly defined business or mission requirement (TechAmerica Foundation, 2012). RTI’s 
experience in this arena supports an approach that integrates new and existing data to 
address focused research questions. For example, a research effort among RTI, the RAND 
Corporation, Structured Decisions Corporation, and the Washington, DC Metropolitan Police 
Department analyzed text data from 911 call transcripts to generate more precise forecasts 
of areas at elevated risk of specific types of crime. Collected by every police department 
in the U.S., data on 911 calls for emergency services traditionally have been used to review 
the efficiency of response time and, among larger police departments, to better allocate 
resources. RTI researchers (with expertise in criminal justice and data analysis) developed a 
prototype software toolkit that can routinely process calls for services data and extract key 
characteristics or behaviors found in each call’s narrative comments. This approach makes 
these data more flexible in helping to identify specific, short-term changes in a given location, 
such as an increase in the presence of certain types of drugs or disturbances. Using keywords 
or themes, the software’s detailed information can help police anticipate and respond more 
quickly to changes in certain types of criminal activities or predict more precisely which 
areas are likely to see an increase in criminal activity. Big data will play an increasingly 
important role in projects like this, as the next generation of 911 computer-aided dispatch 
(CAD) technology will allow individuals to submit information captured by video and text 
message. 
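
As a rough illustration of the keyword-based idea described above, the following Python sketch tags call narratives with themes and counts them by area. It is not RTI's toolkit: the theme lexicon, the function names, and the sample narratives are all hypothetical, and a production system would need far more careful text processing. 

```python
import re
from collections import Counter, defaultdict

# Hypothetical theme lexicon; the article does not list the toolkit's actual keywords.
THEMES = {
    "drugs": {"heroin", "cocaine", "overdose"},
    "disturbance": {"fight", "noise", "argument"},
}


def tag_themes(narrative):
    """Return the set of themes whose keywords appear in a call narrative."""
    words = set(re.findall(r"[a-z]+", narrative.lower()))
    return {theme for theme, keywords in THEMES.items() if words & keywords}


def theme_counts_by_area(calls):
    """Count tagged calls per theme for each area, given (area, narrative) pairs."""
    counts = defaultdict(Counter)
    for area, narrative in calls:
        for theme in tag_themes(narrative):
            counts[area][theme] += 1
    return counts


# Invented sample records, for illustration only.
calls = [
    ("Area A", "Caller reports a loud fight outside the building"),
    ("Area A", "Possible overdose, caller mentions heroin"),
    ("Area B", "Noise complaint, argument between neighbors"),
]
counts = theme_counts_by_area(calls)
```

Comparing such counts across successive time windows is one simple way a short-term change in a given location could be flagged for closer review. 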

At an earlier stage of development is a project being co-funded by RTI and Duke 
University that is evaluating the effectiveness of massive open online courses, or MOOCs. 
Duke launched its first MOOC in conjunction with Coursera in September 2012 (Ferrari, 
2012). Since then, Duke has offered more than a dozen courses and is developing additional 
MOOCs that cover humanities, natural and biological sciences, social science, nursing, 
medicine, and engineering. Properly designed and effectively presented, MOOCs can bring 
the benefits of a world-class education to motivated individuals with little more than an 
Internet connection. Nonetheless, assessing the effectiveness of online teaching approaches 
is an important consideration for MOOCs to live up to their potential. Big data’s evaluation 
methods include in-depth tracking and analysis of online student learning activities, even 
down to the level of mouse clicks. This analysis can be performed across input generated 
from thousands of students instead of from data pulled from small studies (TechAmerica 
Foundation, 2012). RTI and Duke will not only be working on ways to mine and evaluate 
MOOC results but will also be conducting interviews with dozens of the largest employers 
in North Carolina to assess receptivity to hiring workers who have been educated using 
non-traditional methods. 


CONCLUSION 


As the challenges discussed above illustrate, RTI’s workforce needs will be transformed by 
the skill-set demands of big data. We anticipate a blurring of the lines among the disciplines 
of mathematics, statistics, computer science, and various subject matter areas. Meeting the 
demands of big data will require internal changes that range from education and training 
to executive leadership. Based on RTI’s experiences over the last six months, we offer for 
consideration some of our lessons learned thus far. 

(1) Establishing a robust communication plan around our reorganization was not 
sufficient. We quickly found it necessary to form a cross-organization steering committee, 
led by a senior executive, to involve staff and leaders who were not directly impacted by the 
reorganization. We did this in part to correct perceptions that not being included in one of 
the new groups implied not having a role to play in RTI’s big data future. Subject matter 
experts from across the company engaged quickly and enthusiastically to help shape the 
steering committee agenda, a process that expended more time than anticipated in meetings 
and discussions. Our decision to use a “big tent” approach to ensure broad buy-in has slowed 
us down. 

(2) The data analytics talent shortage in other organizations is impacting us sooner than 
expected. RTI’s corporate headquarters location of Research Triangle Park, North Carolina, is 
attracting new firms with a focus on analytics, and we have experienced voluntary departures 
among early-career staff who are realizing significant compensation increases. We are 
responding to these market forces, of course, but we clearly underestimated the degree to 
which our talent base would be targeted. We have increased our internal R&D funding and 
commissioned additional special projects aimed at keeping our top talent fully challenged 
and engaged. 

(3) Our corporate information technology group, traditionally more of an “order taker” 
than a “business consultative” group, is being assessed and realigned under a new CIO. 
RTI’s big data agenda is driving much of the assessment and revealing that our IT staffing 
mix may not be optimal for a big data world. We currently manage our own primary and 
secondary data centers, and will likely move toward outsourcing to private cloud vendors. 
Roles like “system administrator” or “data center manager” may need to be replaced by 
“vendor management” or “business unit liaison.” Such changes will not only impact our 
job titles but will also ripple into our cost accounting and charge-back methodologies, thus 
adding to the complexity of our transformation. IT’s interconnectedness with the entire 
organization should not be underestimated. 

As RTI girds for the big data revolution, we recognize that the challenges we face are 
similar in some respects to those facing our clients. That is, big data will usher in significant 
changes both to our organization and the clients we serve. We recognize the urgency of 
capturing and formulating insights from big data at a time when they can enhance optimal 
decision making, both for our organization and for our clients. 